A class of nonsymmetric preconditioners for saddle point problems
For the iterative solution of saddle point problems, a nonsymmetric preconditioner is studied which, with respect to the upper-left block of the system matrix, can be seen as a variant of SSOR. An idealized situation, where the SSOR is taken with respect to the skew-symmetric part plus the diagonal part of the upper-left block, is analyzed in detail. Since the action of the preconditioner involves the solution of a Schur complement system, an inexact form of the preconditioner can be of interest. This results in an inner-outer iterative process. Numerical experiments with the solution of the linearized Navier-Stokes equations demonstrate the efficiency of the new preconditioner, especially when the upper-left block is far from symmetric.
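As a toy illustration of the ingredients above, a block preconditioner built from the upper-left block and the Schur complement can be wired into GMRES with SciPy. The small random blocks are made up for the example, and the Schur complement is solved exactly here; the paper's inexact variant would replace that solve by an inner iteration, giving the inner-outer process.

```python
import numpy as np
from scipy.sparse import bmat, identity, random as sprandom
from scipy.sparse.linalg import gmres, splu, LinearOperator

rng = np.random.default_rng(0)
n, m = 60, 20
# Hypothetical toy blocks: nonsymmetric upper-left A, constraint block B
A = (sprandom(n, n, density=0.1, random_state=0) + 10 * identity(n)).tocsc()
B = sprandom(m, n, density=0.2, random_state=1).tocsc()
K = bmat([[A, B.T], [B, None]]).tocsc()

# Block LDU preconditioner from A and the Schur complement S = -B A^{-1} B^T
# (formed exactly at this toy size; an inexact form would iterate instead)
Alu = splu(A)
S = -(B @ Alu.solve(B.T.toarray()))
Sinv = np.linalg.inv(S)

def apply_prec(r):
    # Exact block forward/backward solve with K's LDU factors
    r1, r2 = r[:n], r[n:]
    x1 = Alu.solve(r1)
    x2 = Sinv @ (r2 - B @ x1)
    return np.concatenate([x1 - Alu.solve(B.T @ x2), x2])

M = LinearOperator((n + m, n + m), matvec=apply_prec)
b = rng.standard_normal(n + m)
x, info = gmres(K, b, M=M)
print(info, np.linalg.norm(K @ x - b))
```

Because the Schur complement is solved exactly, GMRES converges almost immediately; replacing `Alu.solve` and `Sinv` with approximate solves turns each preconditioner application into an inner iteration.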
Closer to the solutions: iterative linear solvers
The solution of dense linear systems received much attention after the Second World War, and by the end of the sixties most of the problems associated with it had been solved. For a long time, Wilkinson's "The Algebraic Eigenvalue Problem" [107], contrary to what its title suggests, also became the standard textbook for the solution of linear systems. When it became clear that partial differential equations could be solved numerically, to a level of accuracy that was of interest for application areas (such as reservoir engineering and reactor diffusion modeling), there was a strong need for the fast solution of the discretized systems, and iterative methods became popular for these problems.
A note on the error analysis of classical Gram-Schmidt
An error analysis result is given for the classical Gram-Schmidt factorization of a full-rank matrix A into A = QR, where Q is left orthogonal (has orthonormal columns) and R is upper triangular. The work presented here shows that the computed factors satisfy QR = A + E, where E is an appropriately small backward error, but only if the diagonals of R are computed in a manner similar to the Cholesky factorization of the normal equations matrix. A similar result is stated in [Giraud et al., Numer. Math. 101(1):87-100, 2005]. However, for that result to hold, the diagonals of R must be computed in the manner recommended in this work. Comment: 12 pages. This is v2; v1 (from 2006) did not have the bibliographical references set. This is the only modification between v1 and v2. If you want to quote this paper, please quote the version published in Numerische Mathematik.
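The distinction the abstract draws can be sketched in a few lines: a classical Gram-Schmidt routine whose diagonal entries r_jj come from the Cholesky-style identity ||a_j||^2 = sum_i r_ij^2, rather than from the norm of the orthogonalized column. This is a hypothetical minimal implementation, not the authors' code.

```python
import numpy as np

def cgs_cholesky_diag(A):
    """Classical Gram-Schmidt where r_jj is computed Cholesky-style from
    the normal-equations identity ||a_j||^2 = sum_i r_ij^2 (a sketch of
    the variant discussed in the abstract)."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        a = A[:, j]
        R[:j, j] = Q[:, :j].T @ a            # projections onto earlier q_i
        w = a - Q[:, :j] @ R[:j, j]          # orthogonalize against them
        # Cholesky-style diagonal instead of the usual r_jj = ||w||
        R[j, j] = np.sqrt(a @ a - R[:j, j] @ R[:j, j])
        Q[:, j] = w / R[j, j]
    return Q, R

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
Q, R = cgs_cholesky_diag(A)
print(np.linalg.norm(Q @ R - A))   # backward error near machine precision
```

Note that the product QR reproduces A column by column regardless of how r_jj is chosen; the point of the paper is that only the Cholesky-style choice yields the stated backward-error guarantee.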
Assessing the reliability of ensemble forecasting systems under serial dependence
The problem of testing the reliability of ensemble forecasting systems is revisited. A popular tool to assess the reliability of ensemble forecasting systems (for scalar verifications) is the rank histogram; this histogram is expected to be more or less flat, since for a reliable ensemble the ranks are uniformly distributed among their possible outcomes. Quantitative tests for flatness (e.g. Pearson's goodness-of-fit test) have been suggested; without exception though, these tests assume the ranks to be a sequence of independent random variables, which is not the case in general, as can be demonstrated with simple toy examples. In this paper, tests are developed that take the temporal correlations between the ranks into account. A refined analysis exploiting the reliability property shows that the ranks still exhibit strong decay of correlations. This property is key to the analysis, and the proposed tests are valid for general ensemble forecasting systems with minimal extraneous assumptions.
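A minimal rank-histogram computation with the classical Pearson flatness test (the independence-assuming test the paper improves on) might look like this, using synthetic data in place of real forecasts:

```python
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(2)
T, K = 2000, 9                      # forecast times, ensemble members
# Toy reliable ensemble: members and verification drawn from the same law
ens = rng.standard_normal((T, K))
obs = rng.standard_normal(T)

# Rank of the verification among the K members -> K+1 possible outcomes
ranks = (ens < obs[:, None]).sum(axis=1)
counts = np.bincount(ranks, minlength=K + 1)

# Pearson goodness-of-fit test for flatness; this classical form assumes
# independent ranks, the very assumption the paper relaxes
stat, pvalue = chisquare(counts)
print(stat, pvalue)
```

For serially dependent forecasts the chi-square statistic above has the wrong null distribution, which is what motivates the corrected tests of the paper.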
Increasing the power of genome wide association studies in natural populations using repeated measures - evaluation and implementation
1. Genome-wide association studies (GWAS) enable detailed dissections of the genetic basis for organisms' ability to adapt to a changing environment. In long-term studies of natural populations, individuals are often marked at one point in their life and then repeatedly recaptured. It is therefore essential that a method for GWAS includes the process of repeated sampling. In a GWAS, the effects of thousands of single-nucleotide polymorphisms (SNPs) need to be fitted and any model development is constrained by the computational requirements. A method is therefore required that can fit a highly hierarchical model and at the same time is computationally fast enough to be useful. 2. Our method fits fixed SNP effects in a linear mixed model that can include both random polygenic effects and permanent environmental effects. In this way, the model can correct for population structure and model repeated measures. The covariance structure of the linear mixed model is first estimated and subsequently used in a generalized least squares setting to fit the SNP effects. The method was evaluated in a simulation study based on observed genotypes from a long-term study of collared flycatchers in Sweden. 3. The method we present here was successful in estimating permanent environmental effects from simulated repeated measures data. Additionally, we found that, especially for variable phenotypes with large variation between years, the repeated measurements model gives a substantial increase in power compared to a model using average phenotypes as a response. 4. The method is available in the R package RepeatABEL. It increases the power of GWAS with repeated measures, especially for long-term studies of natural populations, and the R implementation is expected to facilitate modelling of longitudinal data for studies of both animal and human populations.
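The two-stage scheme described in point 2 can be sketched with toy data: the covariance matrix of the repeated measures is built once, then each SNP effect is fitted by generalized least squares. All sizes and variance components below are made up for illustration; the actual implementation is the R package RepeatABEL.

```python
import numpy as np

rng = np.random.default_rng(3)
n_ind, n_obs, n_snp = 100, 300, 50
ind = rng.integers(0, n_ind, n_obs)                   # individual per record
G = rng.integers(0, 3, (n_ind, n_snp)).astype(float)  # genotypes coded 0/1/2

# Toy phenotype: permanent environmental effect per individual + noise
pe = rng.standard_normal(n_ind)
y = pe[ind] + rng.standard_normal(n_obs)

# Stage 1 (done once): covariance of y under the mixed model.  Here the
# true variance components are plugged in for illustration.
Z = np.zeros((n_obs, n_ind)); Z[np.arange(n_obs), ind] = 1.0
V = Z @ Z.T + np.eye(n_obs)       # sigma_pe^2 Z Z' + sigma_e^2 I, both = 1
Vinv = np.linalg.inv(V)

# Stage 2 (per SNP, cheap): generalized least squares with X = [1, snp]
for j in range(3):                # a few SNPs for illustration
    X = np.column_stack([np.ones(n_obs), G[ind, j]])
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    print(f"SNP {j}: effect = {beta[1]:+.3f}")
```

Estimating V only once is what keeps the per-SNP cost low enough to scan thousands of markers.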
Parallelization of the QR Decomposition with Column Pivoting Using Column Cyclic Distribution on Multicore and GPU Processors
The QR decomposition with column pivoting (QRP) of a matrix is widely used for rank revealing. The performance of the LAPACK implementation (DGEQP3) of the Householder QRP algorithm is limited by the Level 2 BLAS operations required for updating the column norms. In this paper, we propose an implementation of the QRP algorithm using a distribution of the matrix columns in a round-robin fashion for better data locality and parallel memory bus utilization on multicore architectures. Our performance results show a 60% improvement over the routine DGEQP3 of Intel MKL (version 10.3) on a 12-core Intel Xeon X5670 machine. In addition, we show that the same data distribution is also suitable for general purpose GPU processors, where our implementation obtains up to 90 GFlops on an NVIDIA GeForce GTX480. This is about 2 times faster than the QRP implementation of MAGMA (version 1.2.1).
Tomás and Bai were supported in part by the U.S. DOE SciDAC grant DOE-DE-FC0206ER25793 and NSF grant PHY1005502. This research used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. DOE under Contract No. DE-AC02-05CH11231.
Tomás Domínguez, A. E.; Bai, Z.; Hernández García, V. (2013). Parallelization of the QR Decomposition with Column Pivoting Using Column Cyclic Distribution on Multicore and GPU Processors. In: High Performance Computing for Computational Science - VECPAR 2012, Springer, pp. 50-58. https://doi.org/10.1007/978-3-642-38718-0_8
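The rank-revealing use of QRP described above can be reproduced at small scale with SciPy's pivoted QR (a usage sketch on a made-up test matrix, unrelated to the parallel implementation in the paper):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(4)
# Hypothetical test case: a rank-3 matrix embedded in a 10x6 array
A = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 6))

Q, R, piv = qr(A, pivoting=True, mode='economic')

# Column pivoting makes |R[i,i]| non-increasing; a sharp drop in the
# diagonal reveals the numerical rank
diag = np.abs(np.diag(R))
rank = int((diag > 1e-10 * diag[0]).sum())
print(rank)
```

The column-norm updates that this pivoting strategy requires are exactly the Level 2 BLAS bottleneck the paper's column-cyclic distribution targets.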
Transfinite thin plate spline interpolation
Duchon's method of thin plate splines defines a polyharmonic interpolant to
scattered data values as the minimizer of a certain integral functional. For
transfinite interpolation, i.e. interpolation of continuous data prescribed on
curves or hypersurfaces, Kounchev has developed the method of polysplines,
which are piecewise polyharmonic functions of fixed smoothness across the given
hypersurfaces and satisfy some boundary conditions. Recently, Bejancu has
introduced boundary conditions of Beppo Levi type to construct a semi-cardinal
model for polyspline interpolation to data on an infinite set of parallel
hyperplanes. The present paper proves that, for periodic data on a finite set
of parallel hyperplanes, the polyspline interpolant satisfying Beppo Levi
boundary conditions is in fact a thin plate spline, i.e. it minimizes a Duchon
type functional.
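For the classical scattered-data case that Duchon's functional covers (not the transfinite, data-on-hypersurfaces setting of the paper), SciPy exposes thin plate spline interpolation directly; the sample points and test function below are made up:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(5)
# Scattered 2-D data: the classical Duchon setting of pointwise values
pts = rng.uniform(-1, 1, (50, 2))
vals = np.sin(np.pi * pts[:, 0]) * np.cos(np.pi * pts[:, 1])

# Thin plate spline = minimizer of Duchon's integral functional among
# interpolants of the data
tps = RBFInterpolator(pts, vals, kernel='thin_plate_spline')
print(np.abs(tps(pts) - vals).max())   # reproduces the data
```

The paper's result says that for periodic data on finitely many parallel hyperplanes, the Beppo Levi polyspline interpolant coincides with a minimizer of the same kind of functional.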
Quantum Error Correction via Convex Optimization
We show that the problem of designing a quantum information error correcting
procedure can be cast as a bi-convex optimization problem, iterating between
encoding and recovery, each being a semidefinite program. For a given encoding
operator the problem is convex in the recovery operator. For a given method of
recovery, the problem is convex in the encoding scheme. This allows us to
derive new codes that are locally optimal. We present examples of such codes
that can handle errors which are too strong for codes derived by analogy to
classical error correction techniques. Comment: 16 pages.
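The alternating scheme can be illustrated on a scalar bi-convex toy problem, where each partial minimization has a closed form standing in for the semidefinite programs of the paper (the objective and starting point are made up):

```python
import numpy as np

# Toy bi-convex objective f(x, y) = (x*y - 1)^2 + lam*(x^2 + y^2):
# convex in x for fixed y and in y for fixed x, but not jointly convex.
lam = 0.1
x, y = 2.0, 0.5
for _ in range(100):
    x = y / (y * y + lam)          # exact argmin over x with y fixed
    y = x / (x * x + lam)          # exact argmin over y with x fixed

f = (x * y - 1.0) ** 2 + lam * (x * x + y * y)
print(x, y, f)
```

As in the encoding/recovery iteration of the paper, each step solves a convex subproblem exactly, and the alternation converges to a locally (not necessarily globally) optimal pair.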
Spectrum of the Dirac Operator and Multigrid Algorithm with Dynamical Staggered Fermions
Complete spectra of the staggered Dirac operator D are determined in quenched four-dimensional gauge fields, and also in the presence of dynamical fermions. Periodic as well as antiperiodic boundary conditions are used. An attempt is made to relate the performance of multigrid (MG) and conjugate gradient (CG) algorithms for propagators to the distribution of the eigenvalues of D. The convergence of the CG algorithm is determined only by the condition number of D and by the lattice size. Since the condition numbers do not vary significantly when quarks become dynamical, CG convergence in unquenched fields can be predicted from quenched simulations. On the other hand, MG convergence is not affected by the condition number but depends on the spectrum in a more subtle way. Comment: 19 pages, 8 figures, HUB-IEP-94/12 and KL-TH 19/94; comes as a uuencoded tar-compressed .ps-file
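The claim that CG convergence is governed by the condition number can be checked on assumed SPD model systems (a generic illustration with prescribed spectra, not lattice Dirac operators):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

def cg_iterations(kappa, n=500):
    """CG iteration count for an SPD system with condition number kappa."""
    d = np.linspace(1.0, kappa, n)        # eigenvalues spread over [1, kappa]
    A = diags(d)
    b = np.ones(n)
    count = {'it': 0}
    def cb(xk):
        count['it'] += 1
    x, info = cg(A, b, callback=cb, maxiter=10 * n)
    assert info == 0
    return count['it']

# Iteration counts grow with the condition number, roughly like sqrt(kappa)
for kappa in (10.0, 100.0, 1000.0):
    print(kappa, cg_iterations(kappa))
```

Multigrid, by contrast, is sensitive to how the low eigenmodes look, not just to the extreme eigenvalues, which is the "more subtle" dependence the abstract refers to.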
Articulated Model Registration of MRI/X-Ray Spine Data
Collection: Lecture Notes in Computer Science, vol. 6112.
This paper presents a method based on articulated models for the registration of spine data extracted from multimodal medical images of patients with scoliosis. With the ultimate aim being the development of a complete geometrical model of the torso of a scoliotic patient, this work presents a method for the registration of vertebral column data using 3D magnetic resonance images (MRI) acquired in prone position and X-ray data acquired in standing position for five patients with scoliosis. The 3D shape of the vertebrae is estimated from both image modalities for each patient, and an articulated model is used to calculate the intervertebral transformations required to align the vertebrae between both postures. Euclidean distances between anatomical landmarks are calculated in order to assess multimodal registration error. Results show a decrease in the Euclidean distance using the proposed method compared to rigid registration, and more physically realistic vertebra deformations compared to thin-plate-spline (TPS) registration, thus improving alignment.
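The rigid-registration baseline mentioned in the results can be sketched with a standard Kabsch/Procrustes fit on synthetic landmarks (a generic sketch, not the authors' articulated-model method; the landmark data are made up):

```python
import numpy as np

def rigid_register(P, Q):
    """Least-squares rigid transform (Kabsch) mapping landmark set P onto Q."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # cross-covariance of landmarks
    U, _, Vt = np.linalg.svd(H)
    # Correction matrix guarantees a proper rotation (det = +1)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    Rm = Vt.T @ D @ U.T
    t = cQ - Rm @ cP
    return Rm, t

rng = np.random.default_rng(6)
P = rng.standard_normal((12, 3))              # hypothetical vertebral landmarks
theta = 0.4
Rtrue = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
Q = P @ Rtrue.T + np.array([1.0, -2.0, 0.5])  # rotated + translated copy

Rm, t = rigid_register(P, Q)
err = np.linalg.norm(P @ Rm.T + t - Q, axis=1)  # per-landmark distance
print(err.max())
```

Per-landmark Euclidean distances like `err` are the registration-error measure the abstract describes; the articulated model goes further by allowing per-vertebra transformations instead of one global rigid motion.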